Goto

Collaborating Authors

 body part



Robust Pose Estimation in Crowded Scenes with Direct Pose-Level Inference

Neural Information Processing Systems

Multi-person pose estimation in crowded scenes is challenging because overlapping and occlusions make it difficult to detect person bounding boxes and infer pose cues from individual keypoints. To address those issues, this paper proposes a direct pose-level inference strategy that is free of bounding box detection and keypoint grouping. Instead of inferring individual keypoints, the Pose-level Inference Network (PINet) directly infers the complete pose cues for a person from his/her visible body parts. PINet first applies the Part-based Pose Generation (PPG) to infer multiple coarse poses for each person from his/her body parts. Those coarse poses are refined by the Pose Refinement module through incorporating pose priors, and finally are fused in the Pose Fusion module. PINet relies on discriminative body parts to differentiate overlapped persons, and applies visual body cues to infer the global pose cues. Experiments on several crowded scenes pose estimation benchmarks demonstrate the superiority of PINet. For instance, it achieves 59.8% AP on the OCHuman dataset, outperforming the recent works by a large margin.


MaskedManipulator: Versatile Whole-Body Manipulation

Tessler, Chen, Jiang, Yifeng, Coumans, Erwin, Luo, Zhengyi, Chechik, Gal, Peng, Xue Bin

arXiv.org Artificial Intelligence

We tackle the challenges of synthesizing versatile, physically simulated human motions for full-body object manipulation. Unlike prior methods that are focused on detailed motion tracking, trajectory following, or teleoperation, our framework enables users to specify versatile high-level objectives such as target object poses or body poses. To achieve this, we introduce MaskedManipulator, a generative control policy distilled from a tracking controller trained on large-scale human motion capture data. This two-stage learning process allows the system to perform complex interaction behaviors, while providing intuitive user control over both character and object motions. MaskedManipulator produces goal-directed manipulation behaviors that expand the scope of interactive animation systems beyond task-specific solutions.


Analysis of Deep-Learning Methods in an ISO/TS 15066-Compliant Human-Robot Safety Framework

Bricher, David, Mueller, Andreas

arXiv.org Artificial Intelligence

Over the last years collaborative robots have gained great success in manufacturing applications where human and robot work together in close proximity. However, current ISO/TS-15066-compliant implementations often limit the efficiency of collaborative tasks due to conservative speed restrictions. For this reason, this paper introduces a deep-learning-based human-robot-safety framework (HRSF) that aims at a dynamical adaptation of robot velocities depending on the separation distance between human and robot while respecting maximum biomechanical force and pressure limits. The applicability of the framework was investigated for four different deep learning approaches that can be used for human body extraction: human body recognition, human body segmentation, human pose estimation, and human body part segmentation. Unlike conventional industrial safety systems, the proposed HRSF differentiates individual human body parts from other objects, enabling optimized robot process execution. Experiments demonstrated a quantitative reduction in cycle time of up to 15% compared to conventional safety technology.


Funabot-Upper: McKibben Actuated Haptic Suit Inducing Kinesthetic Perceptions in Trunk, Shoulder, Elbow, and Wrist

Fukatsu, Haru, Yasuda, Ryoji, Funabora, Yuki, Doki, Shinji

arXiv.org Artificial Intelligence

This paper presents Funabot-Upper, a wearable haptic suit that enables users to perceive 14 upper-body motions, including those of the trunk, shoulder, elbow, and wrist. Inducing kinesthetic perception through wearable haptic devices has attracted attention, and various devices have been developed in the past. However, these have been limited to verifications on single body parts, and few have applied the same method to multiple body parts as well. In our previous study, we developed a technology that uses the contraction of artificial muscles to deform clothing in three dimensions. Using this technology, we developed a haptic suit that induces kinesthetic perception of 7 motions in multiple upper body. However, perceptual mixing caused by stimulating multiple human muscles has occurred between the shoulder and the elbow. In this paper, we established a new, simplified design policy and developed a novel haptic suit that induces kinesthetic perceptions in the trunk, shoulder, elbow, and wrist by stimulating joints and muscles independently. We experimentally demonstrated the induced kinesthetic perception and examined the relationship between stimulation and perceived kinesthetic perception under the new design policy. Experiments confirmed that Funabot-Upper successfully induces kinesthetic perception across multiple joints while reducing perceptual mixing observed in previous designs. The new suit improved recognition accuracy from 68.8% to 94.6% compared to the previous Funabot-Suit, demonstrating its superiority and potential for future haptic applications.



Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation

Neural Information Processing Systems

Diffusion models excel at generating visually striking content from text but can inadvertently produce undesirable or harmful content when trained on unfiltered internet data. A practical solution is to selectively removing target concepts from the model, but this may impact the remaining concepts. Prior approaches have tried to balance this by introducing a loss term to preserve neutral content or a regularization term to minimize changes in the model parameters, yet resolving this trade-off remains challenging.